
    Why We Read Wikipedia

    Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a log-based analysis of user activity. Based on an initial series of user surveys, we build a taxonomy of Wikipedia use cases along several dimensions, capturing users' motivations to visit Wikipedia, the depth of knowledge they are seeking, and their knowledge of the topic of interest prior to visiting Wikipedia. Then, we quantify the prevalence of these use cases via a large-scale user survey conducted on live Wikipedia with almost 30,000 responses. Our analyses highlight the variety of factors driving users to Wikipedia, such as current events, media coverage of a topic, personal curiosity, work or school assignments, or boredom. Finally, we match survey responses to the respondents' digital traces in Wikipedia's server logs, enabling the discovery of behavioral patterns associated with specific use cases. For instance, we observe long and fast-paced page sequences across topics for users who are bored or exploring randomly, whereas those using Wikipedia for work or school spend more time on individual articles focused on topics such as science. Our findings advance our understanding of reader motivations and behavior on Wikipedia and can have implications for developers aiming to improve Wikipedia's user experience, editors striving to cater to their readers' needs, third-party services (such as search engines) providing access to Wikipedia content, and researchers aiming to build tools such as recommendation engines. Comment: Published in WWW'17; v2 fixes caption of Table

    Detection of lipoarabinomannan (LAM) in urine is an independent predictor of mortality risk in patients receiving treatment for HIV-associated tuberculosis in sub-Saharan Africa: a systematic review and meta-analysis

    Background: Simple immune capture assays that detect mycobacterial lipoarabinomannan (LAM) antigen in urine are promising new tools for the diagnosis of HIV-associated tuberculosis (HIV-TB). In addition, however, recent prospective cohort studies of patients with HIV-TB have demonstrated associations between LAM in the urine and increased mortality risk during TB treatment, indicating an additional utility of urinary LAM as a prognostic marker. We conducted a systematic review and meta-analysis to summarise the evidence concerning the strength of this relationship in adults with HIV-TB in sub-Saharan Africa, thereby quantifying the assay’s prognostic value. Methods: We searched MEDLINE and Embase databases using comprehensive search terms for ‘HIV’, ‘TB’, ‘LAM’ and ‘sub-Saharan Africa’. Identified studies were reviewed and selected according to predefined criteria. Results: We identified 10 studies eligible for inclusion in this systematic review, reporting on a total of 1172 HIV-TB cases. Of these, 512 patients (44%) tested positive for urinary LAM. After a variable duration of follow-up of between 2 and 6 months, overall case fatality rates among HIV-TB cases varied between 7% and 53%. Pooled summary estimates generated by random-effects meta-analysis showed a two-fold increased risk of mortality for urinary LAM-positive HIV-TB cases compared to urinary LAM-negative HIV-TB cases (relative risk 2.3, 95% confidence interval 1.6–3.1). Some heterogeneity was explained by study setting and patient population in sub-group analyses. Five studies also reported multivariable analyses of risk factors for mortality, and pooled summary estimates demonstrated over two-fold increased mortality risk (odds ratio 2.5, 95% confidence interval 1.4–4.5) among urinary LAM-positive HIV-TB cases, even after adjustment for other risk factors for mortality, including CD4 cell count. Conclusions: We have demonstrated that detectable LAM in urine is associated with increased risk of mortality during TB treatment, and that this relationship remains after adjusting for other risk factors for mortality. This may simply be due to a positive test for urinary LAM serving as a marker of higher mycobacterial load and greater disease dissemination and severity. Alternatively, LAM antigen may directly compromise host immune responses through its known immunomodulatory effects. Detectable LAM in the urine is an independent risk factor for mortality among patients receiving treatment for HIV-TB. Further research is warranted to elucidate the underlying mechanisms and to determine whether this vulnerable patient population may benefit from adjunctive interventions. Electronic supplementary material: The online version of this article (doi:10.1186/s12916-016-0603-9) contains supplementary material, which is available to authorized users
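
    The random-effects pooling mentioned in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say which random-effects estimator was used, so the DerSimonian-Laird estimator is an assumption here, and the per-study relative risks and variances are hypothetical placeholders, not data from the review.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool per-study effects (e.g. log relative risks) with a random-effects model.

    Uses the DerSimonian-Laird moment estimator for the between-study
    variance. Returns (pooled_effect, standard_error, tau_squared).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    sum_ws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_ws
    se = math.sqrt(1.0 / sum_ws)
    return pooled, se, tau2

# Hypothetical per-study relative risks and variances of their logs (illustration only).
log_rr = [math.log(1.8), math.log(2.6), math.log(2.1), math.log(3.0)]
var_log_rr = [0.10, 0.15, 0.08, 0.20]

pooled, se, tau2 = dersimonian_laird(log_rr, var_log_rr)
rr = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
print(f"pooled RR = {rr:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, tau^2 = {tau2:.3f}")
```

    Pooling is done on the log scale and exponentiated afterwards, which is the usual convention for ratio measures such as relative risks and odds ratios.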

    Asynchronous Training of Word Embeddings for Large Text Corpora

    Word embeddings are a powerful approach for analyzing language and have been widely popular in numerous tasks in information retrieval and text mining. Training embeddings over huge corpora is computationally expensive because the input is typically sequentially processed and parameters are synchronously updated. Distributed architectures for asynchronous training that have been proposed either focus on scaling vocabulary sizes and dimensionality or suffer from expensive synchronization latencies. In this paper, we propose a scalable approach to train word embeddings by partitioning the input space instead, in order to scale to massive text corpora without sacrificing the performance of the embeddings. Our training procedure does not involve any parameter synchronization except a final sub-model merge phase that typically executes in a few minutes. Our distributed training scales seamlessly to large corpus sizes, and we obtain comparable performance, and sometimes up to a 45% improvement, on a variety of NLP benchmarks using models trained by our distributed procedure, which requires 1/10 of the time taken by the baseline approach. Finally, we also show that our approach is robust to missing words in sub-models and is able to effectively reconstruct word representations. Comment: This paper contains 9 pages and has been accepted in the WSDM201

    Wavelet Based Fractal Analysis of Airborne Pollen

    The most abundant biological particles in the atmosphere are pollen grains and spores. Self-protection against pollen allergy is possible with information on future pollen content in the air. In spite of the importance of airborne pollen concentration forecasting, it has not been possible to predict pollen concentrations with great accuracy, and about 25% of the daily pollen forecasts have resulted in failures. Previous analyses of the dynamic characteristics of atmospheric pollen time series indicate that the system can be described by a low-dimensional chaotic map. We apply the wavelet transform to study the multifractal characteristics of an airborne pollen time series. We find persistence behaviour associated with low pollen concentration values and with the rarest events of highest pollen concentration values. The information and the correlation dimensions correspond to a chaotic system showing loss of information with time evolution. Comment: 11 pages, 7 figure

    A survey of location inference techniques on Twitter

    The increasing popularity of the social networking service, Twitter, has made it more involved in day-to-day communications, strengthening social relationships and information dissemination. Conversations on Twitter are now being explored as indicators within early warning systems to alert of imminent natural disasters such as earthquakes and aid prompt emergency responses to crime. Producers are privileged to have limitless access to market perception from consumer comments on social media and microblogs. Targeted advertising can be made more effective based on user profile information such as demography, interests and location. While these applications have proven beneficial, the ability to effectively infer the location of Twitter users has even more immense value. However, accurately identifying where a message originated from or an author’s location remains a challenge, thus essentially driving research in that regard. In this paper, we survey a range of techniques applied to infer the location of Twitter users from inception to state of the art. We find significant improvements over time in the granularity levels and better accuracy with results driven by refinements to algorithms and inclusion of more spatial features.

    Domain-independent Extraction of Scientific Concepts from Research Articles

    We examine the novel task of domain-independent scientific concept extraction from abstracts of scholarly articles and present two contributions. First, we suggest a set of generic scientific concepts that have been identified in a systematic annotation process. This set of concepts is utilised to annotate a corpus of scientific abstracts from 10 domains of Science, Technology and Medicine at the phrasal level in a joint effort with domain experts. The resulting dataset is used in a set of benchmark experiments to (a) provide baseline performance for this task, (b) examine the transferability of concepts between domains. Second, we present two deep learning systems as baselines. In particular, we propose active learning to deal with different domains in our task. The experimental results show that (1) a substantial agreement is achievable by non-experts after consultation with domain experts, (2) the baseline system achieves a fairly high F1 score, (3) active learning enables us to nearly halve the amount of required training data. Comment: Accepted for publishing in 42nd European Conference on IR Research, ECIR 202

    Rapid urine-based screening for tuberculosis in HIV-positive patients admitted to hospital in Africa (STAMP): a pragmatic, multicentre, parallel-group, double-blind, randomised controlled trial.

    BACKGROUND Current diagnostics for HIV-associated tuberculosis are suboptimal, with missed diagnoses contributing to high hospital mortality and approximately 374 000 annual HIV-positive deaths globally. Urine-based assays have a good diagnostic yield; therefore, we aimed to assess whether urine-based screening in HIV-positive inpatients for tuberculosis improved outcomes. METHODS We did a pragmatic, multicentre, double-blind, randomised controlled trial in two hospitals in Malawi and South Africa. We included HIV-positive medical inpatients aged 18 years or more who were not taking tuberculosis treatment. We randomly assigned patients (1:1), using a computer-generated list of random block size stratified by site, to either the standard-of-care or the intervention screening group, irrespective of symptoms or clinical presentation. Attending clinicians made decisions about care; and patients, clinicians, and the study team were masked to the group allocation. In both groups, sputum was tested using the Xpert MTB/RIF assay (Xpert; Cepheid, Sunnyvale, CA, USA). In the standard-of-care group, urine samples were not tested for tuberculosis. In the intervention group, urine was tested with the Alere Determine TB-LAM Ag (TB-LAM; Alere, Waltham, MA, USA), and Xpert assays. The primary outcome was all-cause 56-day mortality. Subgroup analyses for the primary outcome were prespecified based on baseline CD4 count, haemoglobin, clinical suspicion for tuberculosis; and by study site and calendar time. We used an intention-to-treat principle for our analyses. This trial is registered with the ISRCTN registry, number ISRCTN71603869. FINDINGS Between Oct 26, 2015, and Sept 19, 2017, we screened 4788 HIV-positive adults, of which 2600 (54%) were randomly assigned to the study groups (n=1300 for each group). 13 patients were excluded after randomisation from analysis in each group, leaving 2574 in the final intention-to-treat analysis (n=1287 in each group). 
At admission, 1861 patients were taking antiretroviral therapy and median CD4 count was 227 cells per μL (IQR 79-436). Mortality at 56 days was reported for 272 (21%) of 1287 patients in the standard-of-care group and 235 (18%) of 1287 in the intervention group (adjusted risk reduction [aRD] -2·8%, 95% CI -5·8 to 0·3; p=0·074). In three of the 12 prespecified but underpowered subgroups, mortality was lower in the intervention group than in the standard-of-care group: CD4 counts less than 100 cells per μL (aRD -7·1%, 95% CI -13·7 to -0·4; p=0·036), severe anaemia (-9·0%, -16·6 to -1·3; p=0·021), and patients with clinically suspected tuberculosis (-5·7%, -10·9 to -0·5; p=0·033); with no difference by site or calendar period. Adverse events were similar in both groups. INTERPRETATION Urine-based tuberculosis screening did not reduce overall mortality in all HIV-positive inpatients, but might benefit some high-risk subgroups. Implementation could contribute towards global targets to reduce tuberculosis mortality. FUNDING Joint Global Health Trials Scheme of the Medical Research Council, the UK Department for International Development, and the Wellcome Trust.
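
    As a rough arithmetic check on the primary outcome reported above, the following Python sketch computes the unadjusted risk difference from the raw 56-day mortality counts. The function name and the Wald (normal-approximation) interval are assumptions made for illustration. The trial reports an adjusted estimate whose model is not described in the abstract, so the figures need not match exactly.

```python
import math

def risk_difference(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted risk difference (group A minus group B) with a Wald interval.

    Returns (difference, lower_bound, upper_bound) as proportions.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    rd = p_a - p_b
    # Standard error of a difference between two independent proportions
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return rd, rd - z * se, rd + z * se

# Counts taken from the abstract: intervention 235/1287, standard of care 272/1287.
rd, lower, upper = risk_difference(235, 1287, 272, 1287)
print(f"risk difference = {rd:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

    The unadjusted difference is about -2.9 percentage points, close to the adjusted -2·8% in the abstract, and the interval spans zero, in line with the reported p=0·074.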

    Electrons in High-Tc Compounds: Ab-Initio Correlation Results

    Electronic correlations in the ground state of an idealized infinite-layer high-Tc compound are computed using the ab-initio method of local ansatz. Comparisons are made with the local-density approximation (LDA) results, and the correlation functions are analyzed in detail. These correlation functions are used to determine the effective atomic-interaction parameters for model Hamiltonians. On the resulting model, doping dependencies of the relevant correlations are investigated. Aside from the expected strong atomic correlations, particular spin correlations arise. The dominating contribution is a strong nearest neighbor correlation that is Stoner-enhanced due to the closeness of the ground state to the magnetic phase. This feature depends moderately on doping, and is absent in a single-band Hubbard model. Our calculated spin correlation function is in good qualitative agreement with that determined from the neutron scattering experiments for a metal. Comment: 21pp, 5fig, Phys. Rev. B (Oct. 98

    Communication calls produced by electrical stimulation of four structures in the guinea pig brain

    One of the main central processes affecting the cortical representation of conspecific vocalizations is the collateral output from the extended motor system for call generation. Before starting to study this interaction we sought to compare the characteristics of calls produced by stimulating four different parts of the brain in guinea pigs (Cavia porcellus). By using anaesthetised animals we were able to reposition electrodes without distressing the animals. Trains of 100 electrical pulses were used to stimulate the midbrain periaqueductal grey (PAG), hypothalamus, amygdala, and anterior cingulate cortex (ACC). Each structure produced a similar range of calls, but in significantly different proportions. Two of the spontaneous calls (chirrup and purr) were never produced by electrical stimulation and although we identified versions of chutter, durr and tooth chatter, they differed significantly from our natural call templates. However, we were routinely able to elicit seven other identifiable calls. All seven calls were produced both during the 1.6 s period of stimulation and subsequently in a period which could last for more than a minute. A single stimulation site could produce four or five different calls, but the amygdala was much less likely to produce a scream, whistle or rising whistle than any of the other structures. These three high-frequency calls were more likely to be produced by females than males. There were also differences in the timing of the call production with the amygdala primarily producing calls during the electrical stimulation and the hypothalamus mainly producing calls after the electrical stimulation. For all four structures a significantly higher stimulation current was required in males than females. We conclude that all four structures can be stimulated to produce fictive vocalizations that should be useful in studying the relationship between the vocal motor system and cortical sensory representation.